Improving Compressed Counting
Abstract
Compressed Counting (CC) [22] was recently proposed for estimating the αth frequency moments of data streams, where 0 < α ≤ 2. CC can be used to estimate Shannon entropy, which can be approximated by certain functions of the αth frequency moments as α → 1. Monitoring Shannon entropy for anomaly detection (e.g., DDoS attacks) in large networks is an important task. This paper presents a new algorithm that improves CC. The improvement is most substantial as α → 1−. For example, when α = 0.99, the new algorithm reduces the estimation variance roughly 100-fold, which would make CC considerably more practical for estimating Shannon entropy. Furthermore, the new algorithm is statistically optimal when α = 0.5.
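The link between frequency moments and Shannon entropy mentioned in the abstract can be illustrated with a minimal sketch. It assumes the Tsallis and Rényi entropies as the "certain functions" of the αth moment (the abstract does not name them), and it computes the moments exactly from a known count vector rather than with a CC estimator. All function names are hypothetical.

```python
import math

def frequency_moment(counts, alpha):
    """Exact alpha-th frequency moment: sum of counts[i] ** alpha over positive entries."""
    return sum(c ** alpha for c in counts if c > 0)

def shannon_entropy(counts):
    """Shannon entropy (in nats) of the distribution counts / sum(counts)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def tsallis_entropy(counts, alpha):
    """Tsallis entropy written in terms of F(alpha) and F(1); requires alpha != 1."""
    f1 = frequency_moment(counts, 1.0)
    fa = frequency_moment(counts, alpha)
    return (1.0 - fa / f1 ** alpha) / (alpha - 1.0)

def renyi_entropy(counts, alpha):
    """Renyi entropy written in terms of F(alpha) and F(1); requires alpha != 1."""
    f1 = frequency_moment(counts, 1.0)
    fa = frequency_moment(counts, alpha)
    return math.log(fa / f1 ** alpha) / (1.0 - alpha)

if __name__ == "__main__":
    counts = [50, 30, 15, 5]  # illustrative data, not from the paper
    print("Shannon:", shannon_entropy(counts))
    for alpha in (0.9, 0.99, 0.999):
        print(alpha, tsallis_entropy(counts, alpha), renyi_entropy(counts, alpha))
```

Both quantities approach the Shannon entropy as α → 1, but each divides by α − 1, so any error in an estimate of the αth moment is amplified near α = 1. This is one way to see why a lower estimation variance matters for this application.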
Similar resources
Sparse Recovery with Very Sparse Compressed Counting
Compressed sensing (sparse signal recovery) often encounters nonnegative data (e.g., images). Recently [11] developed the methodology of using (dense) Compressed Counting for recovering nonnegative K-sparse signals. In this paper, we adopt very sparse Compressed Counting for nonnegative signal recovery. Our design matrix is sampled from a maximally-skewed α-stable distribution (0 < α < 1), and ...
Optimal Trade-Offs for Succinct String Indexes
Let s be a string whose symbols are solely available through access(i), a read-only operation that probes s and returns the symbol at position i in s. Many compressed data structures for strings, trees, and graphs, require two kinds of queries on s: select(c, j), returning the position in s containing the jth occurrence of c, and rank(c, p), counting how many occurrences of c are found in the f...
The Optimal Quantile Estimator for Compressed Counting
Compressed Counting (CC) was recently proposed for very efficiently computing the (approximate) αth frequency moments of data streams, where 0 < α ≤ 2. Several estimators were reported including the geometric mean estimator, the harmonic mean estimator, the optimal power estimator, etc. The geometric mean estimator is particularly interesting for theoretical purposes. For example, when...
Post-Operative Time Effects after Sciatic Nerve Crush on the Number of Alpha Motoneurons, Using a Sterological Counting Method (Disector)
There is extensive evidence that axonal processes of the nervous system (peripheral and/or central) may degenerate after nerve injuries. Wallerian degeneration and chromatolysis are the most conspicuous phenomena that occur in response to injuries. In this research, the effects of post-operative time following sciatic nerve crush on the number of spinal motoneurons were investigated....
Compressed counting
Counting is a fundamental operation. For example, counting the αth frequency moment, F(α) = ∑_{i=1}^{D} A_t[i]^α, of a streaming signal A_t (where t denotes time), has been an active area of research in theoretical computer science, databases, and data mining. When α = 1, the task (i.e., counting the sum) can be accomplished using a counter. When α ≠ 1, however, it becomes non-trivial to des...
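The contrast between α = 1 and α ≠ 1 can be sketched as follows. This is an exact baseline rather than Compressed Counting itself, and it assumes a stream of (index, increment) updates to the signal, a model the snippet above does not spell out. The class and method names are hypothetical.

```python
from collections import defaultdict

class ExactMoments:
    """Exact baseline that stores the whole signal (assumed update model)."""

    def __init__(self):
        self.counts = defaultdict(float)  # full vector A_t, one entry per index seen
        self.total = 0.0                  # single running counter, enough for alpha = 1

    def update(self, i, increment):
        """Apply one stream update: A_t[i] = A_{t-1}[i] + increment."""
        self.counts[i] += increment
        self.total += increment

    def moment(self, alpha):
        """Exact alpha-th frequency moment; needs every stored entry when alpha != 1."""
        return sum(v ** alpha for v in self.counts.values() if v > 0)

if __name__ == "__main__":
    m = ExactMoments()
    for i, inc in [(0, 3), (1, 2), (0, 1), (2, 5)]:  # illustrative stream
        m.update(i, inc)
    print(m.total, m.moment(1.0), m.moment(2.0))
```

The running `total` tracks the α = 1 moment with constant memory, whereas `moment` for any other α reads the full dictionary, whose size grows with the number of distinct indices. Avoiding that storage is the motivation for sketch-based estimators.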